230 research outputs found

    Initial fixation placement in face images is driven by top-down guidance

    The eyes are often inspected first and for a longer period during face exploration. To examine whether this saliency of the eye region at the early stage of face inspection is attributed to its local structure properties or to the knowledge of its essence in facial communication, in this study we investigated the pattern of eye movements produced by rhesus monkeys (Macaca mulatta) as they free-viewed images of monkey faces. Eye positions were recorded accurately using implanted eye coils, while images of original faces, faces with scrambled eyes, and scrambled faces except for the eyes were presented on a computer screen. The eye region in the scrambled faces attracted the same proportion of viewing time and fixations as it did in the original faces, and even the scrambled eyes attracted a substantial proportion of viewing time and fixations. Furthermore, the monkeys often made the first saccade towards the location of the eyes regardless of image content. Our results suggest that the initial fixation placement in faces is driven predominantly by ‘top-down’ or internal factors, such as the prior knowledge of the location of “eyes” within the context of a face.

    Heterochrony and Cross-Species Intersensory Matching by Infant Vervet Monkeys

    Understanding the evolutionary origins of a phenotype requires understanding the relationship between ontogenetic and phylogenetic processes. Human infants have been shown to undergo a process of perceptual narrowing during their first year of life, whereby their intersensory ability to match the faces and voices of another species declines as they get older. We investigated the evolutionary origins of this behavioral phenotype by examining whether or not this developmental process occurs in non-human primates as well. We tested the ability of infant vervet monkeys (Cercopithecus aethiops), ranging in age from 23 to 65 weeks, to match the faces and voices of another non-human primate species (the rhesus monkey, Macaca mulatta). Even though the vervets had no prior exposure to rhesus monkey faces and vocalizations, our findings show that infant vervets can, in fact, recognize the correspondence between rhesus monkey faces and voices (but indicate that they do so by looking at the non-matching face for a greater proportion of overall looking time), and can do so well beyond the age of perceptual narrowing in human infants. Our results further suggest that the pattern of matching by vervet monkeys is influenced by the emotional saliency of the Face+Voice combination. That is, although they looked at the non-matching screen for Face+Voice combinations, they switched to looking at the matching screen when the Voice was replaced with a complex tone of equal duration. Furthermore, an analysis of pupillary responses revealed that their pupils showed greater dilation when looking at the matching natural face/voice combination versus the face/tone combination. Because the infant vervets in the current study exhibited cross-species intersensory matching far later in development than do human infants, our findings suggest either that intersensory perceptual narrowing does not occur in Old World monkeys or that it occurs later in development. We argue that these findings reflect the faster rate of neural development in monkeys relative to humans and the resulting differential interaction of this factor with the effects of early experience.

    The Natural Statistics of Audiovisual Speech

    Humans, like other animals, are exposed to a continuous stream of signals, which are dynamic, multimodal, extended, and time varying in nature. This complex input space must be transduced and sampled by our sensory systems and transmitted to the brain where it can guide the selection of appropriate actions. To simplify this process, it's been suggested that the brain exploits statistical regularities in the stimulus space. Tests of this idea have largely been confined to unimodal signals and natural scenes. One important class of multisensory signals for which a quantitative input space characterization is unavailable is human speech. We do not understand what signals our brain has to actively piece together from an audiovisual speech stream to arrive at a percept versus what is already embedded in the signal structure of the stream itself. In essence, we do not have a clear understanding of the natural statistics of audiovisual speech. In the present study, we identified the following major statistical features of audiovisual speech. First, we observed robust correlations and close temporal correspondence between the area of the mouth opening and the acoustic envelope. Second, we found the strongest correlation between the area of the mouth opening and vocal tract resonances. Third, we observed that both area of the mouth opening and the voice envelope are temporally modulated in the 2–7 Hz frequency range. Finally, we show that the timing of mouth movements relative to the onset of the voice is consistently between 100 and 300 ms. We interpret these data in the context of recent neural theories of speech which suggest that speech communication is a reciprocally coupled, multisensory event, whereby the outputs of the signaler are matched to the neural processes of the receiver.
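
    To make the first and last findings concrete, the following is a minimal sketch of how a correlation and a lead time between a mouth-opening trace and an acoustic envelope could be estimated. It is not the study's analysis pipeline: the function names `pearson` and `best_lag`, the 100 Hz sampling rate, and the synthetic 4 Hz signals with a 150 ms offset are all assumptions chosen for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def best_lag(mouth, envelope, max_lag):
    """Return (lag, r): the delay of `envelope` behind `mouth`, in samples,
    that gives the highest correlation, searched over 0..max_lag."""
    best = (0, pearson(mouth, envelope))
    for lag in range(1, max_lag + 1):
        r = pearson(mouth[:-lag], envelope[lag:])
        if r > best[1]:
            best = (lag, r)
    return best

# Synthetic data (hypothetical, not from the study): a 4 Hz modulation
# sampled at 100 Hz, with the envelope trailing the mouth by 15 samples.
fs = 100      # assumed sampling rate in Hz
shift = 15    # assumed offset in samples (150 ms at 100 Hz)
mouth = [math.sin(2 * math.pi * 4 * t / fs) for t in range(200)]
envelope = [math.sin(2 * math.pi * 4 * (t - shift) / fs) for t in range(200)]

lag, r = best_lag(mouth, envelope, max_lag=20)
lag_ms = 1000 * lag / fs  # convert the lag from samples to milliseconds
```

    The search range is kept shorter than one modulation period (25 samples here) so that the periodic signal does not produce a second, equally good lag.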

    Cross modal perception of body size in domestic dogs (Canis familiaris)

    While the perception of size-related acoustic variation in animal vocalisations is well documented, little attention has been given to how this information might be integrated with corresponding visual information. Using a cross-modal design, we tested the ability of domestic dogs to match growls resynthesised to be typical of either a large or a small dog to size-matched models. Subjects looked at the size-matched model significantly more often and for a significantly longer duration than at the incorrect model, showing that they have the ability to relate information about body size from the acoustic domain to the appropriate visual category. Our study suggests that the perceptual and cognitive mechanisms at the basis of size assessment in mammals have a multisensory nature, and calls for further investigations of the multimodal processing of size information across animal species.

    Monkeys and Humans Share a Common Computation for Face/Voice Integration

    Speech production involves the movement of the mouth and other regions of the face resulting in visual motion cues. These visual cues enhance intelligibility and detection of auditory speech. As such, face-to-face speech is fundamentally a multisensory phenomenon. If speech is fundamentally multisensory, it should be reflected in the evolution of vocal communication: similar behavioral effects should be observed in other primates. Old World monkeys share with humans vocal production biomechanics and communicate face-to-face with vocalizations. It is unknown, however, if they, too, combine faces and voices to enhance their perception of vocalizations. We show that they do: monkeys combine faces and voices in noisy environments to enhance their detection of vocalizations. Their behavior parallels that of humans performing an identical task. We explored what common computational mechanism(s) could explain the pattern of results we observed across species. Standard explanations or models such as the principle of inverse effectiveness and a “race” model failed to account for their behavior patterns. Conversely, a “superposition model”, positing the linear summation of activity patterns in response to visual and auditory components of vocalizations, served as a straightforward but powerful explanatory mechanism for the observed behaviors in both species. As such, it represents a putative homologous mechanism for integrating faces and voices across primates.
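
    One common way to test a race model against reaction-time data is the race-model inequality, which bounds the audiovisual cumulative distribution by the sum of the two unimodal ones: P(RT_AV ≤ t) ≤ P(RT_A ≤ t) + P(RT_V ≤ t). The abstract does not say which test the authors used, so the sketch below is only an illustration of that general technique. The function names and the reaction times (in ms) are hypothetical.

```python
def ecdf(samples, t):
    """Empirical probability that a reaction time is <= t."""
    return sum(1 for s in samples if s <= t) / len(samples)

def race_violations(rt_a, rt_v, rt_av, times):
    """Return the time points where the audiovisual CDF exceeds the
    race-model bound, i.e. the sum of the unimodal CDFs capped at 1."""
    violated = []
    for t in times:
        bound = min(1.0, ecdf(rt_a, t) + ecdf(rt_v, t))
        if ecdf(rt_av, t) > bound:
            violated.append(t)
    return violated

# Hypothetical reaction times in ms (not data from the study).
rt_a = [420, 450, 480, 510, 540]    # auditory-only trials
rt_v = [430, 460, 490, 520, 550]    # visual-only trials
rt_av = [330, 350, 380, 400, 430]   # audiovisual trials

violations = race_violations(rt_a, rt_v, rt_av, times=range(300, 601, 50))
```

    With these made-up numbers the audiovisual responses are faster than a race between the two unimodal channels allows at 350, 400 and 450 ms, which is the kind of pattern that would count against a race account.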

    Social interactions through the eyes of macaques and humans

    Group-living primates frequently interact with each other to maintain social bonds as well as to compete for valuable resources. Observing such social interactions between group members provides individuals with essential information (e.g. on the fighting ability or altruistic attitude of group companions) to guide their social tactics and choice of social partners. This process requires individuals to selectively attend to the most informative content within a social scene. It is unclear how non-human primates allocate attention to social interactions in different contexts, and whether they share similar patterns of social attention to humans. Here we compared the gaze behaviour of rhesus macaques and humans when free-viewing the same set of naturalistic images. The images contained positive or negative social interactions between two conspecifics of different phylogenetic distance from the observer; i.e. affiliation or aggression exchanged by two humans, rhesus macaques, Barbary macaques, baboons or lions. Monkeys directed a variable amount of gaze at the two conspecific individuals in the images according to their roles in the interaction (i.e. giver or receiver of affiliation/aggression). Their gaze distribution to non-conspecific individuals was systematically varied according to the viewed species and the nature of interactions, suggesting a contribution of both prior experience and innate bias in guiding social attention. Furthermore, the monkeys’ gaze behavior was qualitatively similar to that of humans, especially when viewing negative interactions. Detailed analysis revealed that both species directed more gaze at the face than the body region when inspecting individuals, and attended more to the body region in negative than in positive social interactions. Our study suggests that monkeys and humans share a similar pattern of role-sensitive, species- and context-dependent social attention, implying a homologous cognitive mechanism of social attention between rhesus macaques and humans.

    Millisecond-Timescale Local Network Coding in the Rat Primary Somatosensory Cortex

    Correlation among neocortical neurons is thought to play an indispensable role in mediating sensory processing of external stimuli. The role of temporal precision in this correlation has been hypothesized to enhance information flow along sensory pathways. Its role in mediating the integration of information at the output of these pathways, however, remains poorly understood. Here, we examined spike timing correlation between simultaneously recorded layer V neurons within and across columns of the primary somatosensory cortex of anesthetized rats during unilateral whisker stimulation. We used Bayesian statistics and information theory to quantify the causal influence between the recorded cells with millisecond precision. For each stimulated whisker, we inferred stable, whisker-specific, dynamic Bayesian networks over many repeated trials, with network similarity of 83.3±6% within whisker, compared to only 50.3±18% across whiskers. These networks further provided information about whisker identity that was approximately 6 times higher than what was provided by the latency to first spike and 13 times higher than what was provided by the spike count of individual neurons examined separately. Furthermore, prediction of individual neurons' precise firing conditioned on knowledge of putative pre-synaptic cell firing was 3 times higher than predictions conditioned on stimulus onset alone. Taken together, these results suggest the presence of a temporally precise network coding mechanism that integrates information across neighboring columns within layer V about vibrissa position and whisking kinetics to mediate whisker movement by motor areas innervated by layer V.